Abstract

A coupling method and an analytic one allow us to prove new lower bounds for the spectral gap of reversible diffusions on compact manifolds. These bounds are based on a notion of curvature of the diffusion, like the coarse Ricci curvature or the Bakry-Emery curvature-dimension inequalities. We show that when this curvature is nonnegative, its harmonic mean is a lower bound for the spectral gap.

Introduction 

The study of the spectrum of the Laplace operator on Riemannian manifolds has many applications in various domains of mathematics. A whole chapter of [5] is devoted to this issue. In this article, we take the convention

$$\Delta = g^{ij}\nabla_i\nabla_j$$

for the Laplace operator. The spectral gap of $\Delta$ is the opposite of the greatest non-zero eigenvalue of $\Delta$ (the spectrum of $\Delta$ is discrete and non-positive).

One way to estimate this spectral gap is to use the Ricci curvature, as in the Lichnerowicz theorem (see [10]).

Theorem 1 (Lichnerowicz) Let $\mathcal{M}$ be an $n$-dimensional Riemannian manifold. If there exists $K > 0$ such that for each $x \in \mathcal{M}$ and each $u \in T_x\mathcal{M}$ we have $\mathrm{Ric}_x(u,u) \ge K g_x(u,u)$, then the spectral gap $\lambda_1$ of $\Delta$ satisfies

$$\lambda_1 \ge \frac{n}{n-1}K.$$

Here we denote by Ric the Ricci curvature of $\mathcal{M}$.

Chen and Wang improved this result in [7], using the diameter of the manifold in their estimates:

Theorem 2 Let $\mathcal{M}$ be a compact connected $n$-dimensional Riemannian manifold, $K$ be the infimum of the Ricci curvature on $\mathcal{M}$ and $D$ be the diameter of $\mathcal{M}$. Then if $K > 0$, we have the following bounds:
and if $n > 1$,

nK 



Ai > 



(n-i) u- co ^(^^y\ 



And if K < 0, we have the following bounds: 



and if n > 1 , 



2D 2 I< 

Ai > 



D 2 ch 
In [2], E. Aubry gives a lower bound for $\lambda_1$ when the curvature

$$\mathrm{Ric}(x) := \inf_{u \in T_x\mathcal{M},\ g_x(u,u)=1} \mathrm{Ric}_x(u,u)$$

is close to a positive constant in the sense of the $L^p$ norm with $p$ large enough:

Theorem 3 Let $\mathcal{M}$ be a complete $n$-dimensional Riemannian manifold, $p > \frac{n}{2}$ and $K > 0$, such that

$$\int_{\mathcal{M}} (\mathrm{Ric} - K)_-^p < +\infty.$$

Then $\mathcal{M}$ has a finite volume and the spectral gap of $\Delta$ on $\mathcal{M}$ satisfies:

n — 1 \ 

where $C(p,n)$ is a constant only depending on $p$ and $n$, and $\|f\|_p = \left(\frac{\int_{\mathcal{M}} |f|^p}{\mathrm{vol}(\mathcal{M})}\right)^{1/p}$.

This allows a little negative curvature, which is not the case for our results.

This article recapitulates and extends the results already stated in [13] and presents a coupling method, which is better suited to discrete spaces than the analytic one.

We show by a coupling method that another bound for $\lambda_1$ is the harmonic mean of the Ricci curvature.

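Theorem 4 Let $\mathcal{M}$ be a compact Riemannian manifold with positive Ricci curvature. Then we have

$$\frac{1}{\lambda_1} \le \int_{\mathcal{M}} \frac{d\mu(x)}{\mathrm{Ric}(x)},$$

with $d\mu = \frac{d\mathrm{vol}}{\mathrm{vol}(\mathcal{M})}$, where vol is the Riemannian volume measure on $\mathcal{M}$.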

This bound is often better than the Lichnerowicz one because the harmonic mean is larger (and can be much larger) than the infimum. But unfortunately we lose the factor $\frac{n}{n-1}$.

Merging the proof of Theorem 1 and an analytic proof of Theorem 4 gives us the following improvement:
Theorem 5 Let $\mathcal{M}$ be a Riemannian manifold with positive Ricci curvature and $K = \inf_{x \in \mathcal{M}} \mathrm{Ric}(x)$. Then for every $0 \le c \le K$, we have:

$$\lambda_1 \ge \frac{n}{n-1}c + \frac{1}{\int_{\mathcal{M}} \frac{d\mu(x)}{\mathrm{Ric}(x) - c}},$$

with $d\mu$ the normalized Riemannian volume measure, as in Theorem 4.

Taking $c = K$ gives us the Lichnerowicz bound or even better, while $c = 0$ gives us Theorem 4.
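The trade-off between the two terms of Theorem 5 can be explored numerically. The following is only a minimal sketch, assuming a curvature that takes finitely many values on regions of known (positive) normalized volume; the function names `theorem5_bound` and `best_bound` and the sample data are hypothetical and do not come from the article.

```python
def theorem5_bound(ric_values, weights, n, c):
    """Right-hand side of Theorem 5 for a piecewise constant curvature.

    Assumption of this sketch: ric_values[i] is the value of Ric on a region
    whose normalized volume is weights[i] / sum(weights), with weights[i] > 0.
    """
    K = min(ric_values)
    if not 0.0 <= c <= K:
        raise ValueError("c must satisfy 0 <= c <= inf Ric")
    if c == K:
        # The integral of 1 / (Ric - c) is infinite, so its inverse vanishes
        # and only the Lichnerowicz-type term remains.
        return n / (n - 1) * c
    total = float(sum(weights))
    integral = sum((w / total) / (r - c) for r, w in zip(ric_values, weights))
    return n / (n - 1) * c + 1.0 / integral


def best_bound(ric_values, weights, n, steps=1000):
    """Maximize the bound over a regular grid of admissible values of c."""
    K = min(ric_values)
    grid = (min(K, K * i / steps) for i in range(steps + 1))
    return max(theorem5_bound(ric_values, weights, n, c) for c in grid)
```

With `c = 0` the function returns the harmonic mean of Theorem 4, and with `c = K` it returns the Lichnerowicz bound; `best_bound` simply scans the values in between.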

Our coupling approach is based on a notion of coarse Ricci curvature, introduced by Yann Ollivier in [12], which uses the Wasserstein distance $W_1$. A major step in our proof is the use of the coupling given by the following theorem:

Theorem 6 Let $\mathcal{M}$ be a smooth Riemannian manifold, and $F^i$ be a smooth vector field on $\mathcal{M}$. Assume that there exists a diffusion process associated with the generator $Lf = \Delta f + F^i\nabla_i f$. Let $\kappa(x,y)$ be the coarse Ricci curvature of the diffusion between $x$ and $y$ (see Definition 10). Then for any two distinct points $x$ and $y$ of $\mathcal{M}$, there exists a coupling $(x(t),y(t))$ between the paths of the diffusion process starting at $x$ and $y$ which satisfies:

$$d(x(t),y(t)) = d(x,y)\,e^{-\int_0^t \kappa(x(s),y(s))\,ds}$$

on the event that for any $s \in [0,t]$, $(x(s),y(s))$ does not belong to the cut-locus of $\mathcal{M}$.

The contraction rate $\kappa(x,y)$ of this coupling behaves like the one of the coupling derived from the diffusion in $C^1$ path space defined by M. Arnaudon, K. A. Coulibaly and A. Thalmaier in [1] when $x$ and $y$ are close. We have a cut-locus problem that we avoid by making a compactness assumption, which is in any case necessary to replace $\kappa(x,y)$ by its limit when $x$ and $y$ are infinitely close.

The coupling method and the analytic one keep working when we add a drift to the Brownian motion, provided the diffusion is reversible. In this case, the generator takes the following form:

$$L = g^{ij}\left(\nabla_i\nabla_j - (\nabla_i\varphi)\nabla_j\right)$$

with $\varphi$ a smooth function on $\mathcal{M}$, and $e^{-\varphi}d\mathrm{vol}$ is a reversible measure. We then have the following generalization of Theorem 5:

Theorem 7 Let $\mathcal{M}$ be a compact Riemannian manifold and $L = g^{ij}(\nabla_i\nabla_j - (\nabla_i\varphi)\nabla_j)$ be the operator associated with a reversible diffusion process on $\mathcal{M}$. Suppose that we have a curvature-dimension inequality in the sense of Bakry-Emery (see [3] or [4]) with a positive curvature $\rho$ and a constant and positive dimension $n'$, which is

$$\Gamma_2(f)(x) \ge \rho(x)\Gamma(f)(x) + \frac{1}{n'}\left(L(f)(x)\right)^2.$$

Let $R$ be the infimum of $\rho$. Then for every $0 \le c \le R$, we have

$$\lambda_1(L) \ge \frac{n'}{n'-1}c + \frac{1}{\int_{\mathcal{M}} \frac{d\pi(x)}{\rho(x) - c}}$$

with $d\pi = \frac{e^{-\varphi}d\mathrm{vol}}{\int_{\mathcal{M}} e^{-\varphi}d\mathrm{vol}}$ the reversible probability measure.
We try to generalize our coupling method to diffusions which are not adapted to the metric $g$, that is, whose generator takes the more general form:

$$L = \frac{1}{2}A^{ij}\nabla_i\nabla_j + F^i\nabla_i$$

without necessarily having $A^{ij} = g^{ij}$ anymore. We have a generalization of Theorem 6 only under the very restrictive condition:

$$(H)\quad \forall u \in T\mathcal{M},\ u^i g_{jk}u^j g_{lm}u^l \nabla_i A^{km} = 0 \iff g^{il}\nabla_l A^{jk} + g^{jl}\nabla_l A^{ki} + g^{kl}\nabla_l A^{ij} = 0$$

and with a lower $\tilde\kappa(x,y)$ instead of $\kappa(x,y)$. Note that (H) is true for $A^{ij} = g^{ij}$, in which case we have $\tilde\kappa = \kappa$.

We have the following generalization of Theorem 4:

Theorem 8 Consider a diffusion process on a compact Riemannian manifold $\mathcal{M}$ which is reversible and satisfies (H). For every $x$ in $\mathcal{M}$, we set $\tilde\kappa(x) = \inf_{u \in T_x\mathcal{M}} \tilde\kappa(x,u)$. If we have $\tilde\kappa(x) \ge \varepsilon > 0$, then the spectral gap of $L$ is at least the harmonic mean of $\tilde\kappa$ (with respect to the reversible probability measure $\pi$):

In Section 1 we present a short argument which shows how we can derive the harmonic mean from Theorem 6. In Section 2 we define the coarse Ricci curvature for diffusions and construct our couplings; this is where Theorem 6 is proved. In Section 3 we present the proofs of the harmonic mean bounds for the spectral gap, both those using the couplings and the purely analytical ones.

1 The harmonic mean in a nutshell 

The result and its proof presented in this section are a shortcut found by Yann Ollivier to obtain a harmonic mean from Theorem 6.

Using a classical method, we will prove thanks to Theorem 6 the following result, which is a weaker version of Theorem 4:

Theorem 9 Let $\mathcal{M}$ be a compact Riemannian manifold with positive Ricci curvature, and $f$ be any 1-Lipschitz function on $\mathcal{M}$. Then the variance of $f$ is at most the average of $\frac{1}{\mathrm{Ric}}$.

This is indeed weaker than Theorem 4: the Poincare inequality states that $\mathrm{Var}_\mu(f) \le \frac{1}{\lambda_1}\int \|\nabla f\|^2 d\mu$, and the integral on the right hand side is at most 1 for 1-Lipschitz functions. In [11], E. Milman shows that the converse is true, i.e. a control on the variance of Lipschitz functions (and even on the $L^1$ norm of 0-mean Lipschitz functions) implies a Poincare inequality, with a universal loss in the constants, under the hypothesis of a Bakry-Emery $CD(0,\infty)$ curvature-dimension inequality.

Proof of Theorem 9. We only have to prove the result for $f$ regular enough, and use a density argument to get the result for non-regular $f$. We consider the semi-group $P^t$ generated by the Laplacian operator. Then the limit of $P^t$ when $t$ tends to infinity is the operator which associates to $f$ the constant function equal to the mean of $f$ (with respect to the normalized Riemannian volume measure). So the variance of $f$ is the limit of the mean of $P^t(f^2) - (P^t(f))^2$ when $t$ tends to infinity. We have

$$P^t(f^2) - (P^t(f))^2 = \int_0^t \frac{d}{ds}\left(P^s\left((P^{t-s}(f))^2\right)\right)ds = \int_0^t P^s\left(2\|\nabla(P^{t-s}(f))\|^2\right)ds.$$

Integrating over $\mathcal{M}$ yields

$$\int_{\mathcal{M}}\left(P^t(f^2)(x) - (P^t(f)(x))^2\right)d\mathrm{vol}(x) = \int_{\mathcal{M}}\int_0^t 2\|\nabla(P^{t-s}(f))(x)\|^2\,ds\,d\mathrm{vol}(x).$$

Thanks to Theorem 6, by taking $y$ very close to $x$, we have

$$\|\nabla(P^{t-s}(f))(x)\| \le E_{P_x}\left[e^{-\int_0^{t-s}\mathrm{Ric}(X_u)\,du}\,\|\nabla f(X_{t-s})\|\right],$$

where the right hand side is the expectation of the term inside the brackets when $X$ has the law $P_x$ of the twice accelerated Brownian motion on $\mathcal{M}$ starting at $x$. Using the convexity of the exponential function, and the fact that $f$ is 1-Lipschitz, we then get

$$\int_{\mathcal{M}}\left(P^t(f^2)(x) - (P^t(f)(x))^2\right)d\mathrm{vol}(x) \le 2\int_{\mathcal{M}}\int_0^t\left(E_{P_x}\left[e^{-\int_0^{t-s}\mathrm{Ric}(X_u)\,du}\right]\right)^2 ds\,d\mathrm{vol}(x)$$

$$\le 2\int_{\mathcal{M}}\int_0^t E_{P_x}\left[\int_0^1 e^{-2(t-s)\mathrm{Ric}(X_{(t-s)u})}\,du\right]ds\,d\mathrm{vol}(x)$$

$$= 2\int_0^t\int_{\mathcal{M}} e^{-2(t-s)\mathrm{Ric}(x)}\,d\mathrm{vol}(x)\,ds = \int_{\mathcal{M}}\frac{1 - e^{-2t\mathrm{Ric}(x)}}{\mathrm{Ric}(x)}\,d\mathrm{vol}(x).$$

We just have to take the limit when $t$ tends to infinity and to divide by $\int_{\mathcal{M}} d\mathrm{vol}(x)$ to get the theorem. □

2 Coarse Ricci curvature for diffusions 
on Riemannian manifolds 

In this section, we introduce the coarse Ricci curvature $\kappa$ for general diffusions and give an explicit formula. Then we construct the coupling of Theorem 6, we show why the (H) condition is needed, and we define $\tilde\kappa$ when it is satisfied.



2.1 Coarse Ricci curvature: definition and calcu- 
lation 

Following what is done in [12] for Markov chains, we define the coarse Ricci curvature of diffusions as the rate of decay of the Wasserstein distance $W_1$ between the measures associated with the diffusion and starting at two different points:

Definition 10 Let $\mathcal{M}$ be a Riemannian manifold and $P^t$ be the semi-group of a diffusion on $\mathcal{M}$. The coarse Ricci curvature between two different points $x$ and $y$ is the following quantity:

$$\kappa(x,y) := \lim_{t \to 0}\frac{d(x,y) - W_1(P^t_x, P^t_y)}{t\,d(x,y)},$$

where $P^t_x = \delta_x.P^t$ is the law at time $t$ of the diffusion starting at $x$.
The Wasserstein distance $W_1$ between two measures is the infimum over all the couplings of the expectation of the distance. Our coupling will be constructed thanks to optimal ones.

To get an expression of this curvature only depending on the coefficients of the generator of the diffusion, we need to make sure that the diffusion does not move far away too fast.
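As an illustration of Definition 10, the quotient defining the coarse Ricci curvature can be evaluated in closed form on a simple example. The sketch below is an assumption-laden toy case, not taken from the article: it uses the one-dimensional Ornstein-Uhlenbeck diffusion with generator $\frac{1}{2}f'' - xf'$, whose law at time $t$ started at $x$ is Gaussian with mean $xe^{-t}$ and a variance independent of $x$, and the fact that the $W_1$ distance between two Gaussian laws on the real line with the same variance is the distance between their means. The function names are hypothetical.

```python
import math


def ou_mean(x, t):
    """Mean at time t of the Ornstein-Uhlenbeck process dX = -X dt + dW started at x."""
    return x * math.exp(-t)


def w1_same_variance_gaussians(m1, m2):
    """W1 distance between two real Gaussian laws sharing the same variance.

    The translation coupling is optimal in that case, so only the means matter.
    """
    return abs(m1 - m2)


def coarse_ricci_quotient(x, y, t):
    """Quotient (d(x,y) - W1(P^t_x, P^t_y)) / (t d(x,y)) of Definition 10."""
    d = abs(x - y)
    w1 = w1_same_variance_gaussians(ou_mean(x, t), ou_mean(y, t))
    return (d - w1) / (t * d)
```

For this toy diffusion the quotient equals $(1 - e^{-t})/t$ for every pair of distinct points, so it tends to 1 as $t$ tends to 0, which is the value $-l^{(1)}F(x) - l^{(2)}F(y)$ predicted by the drift terms alone.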

Definition 11 A diffusion on $\mathcal{M}$ is said to be locally uniformly $L^1$-bounded at $x$ if $\exists M > 0$, $\exists \eta > 0$, $\forall y \in \mathcal{M}$ with $d(x,y) < \eta$, $\forall\, 0 < t < \eta$, $\int d(x,z)\,d(\delta_y.P^t)(z) < M$.

Remark 12 If $\mathcal{M}$ is compact, any diffusion is locally uniformly $L^1$-bounded at each point (it suffices to take $M$ equal to the diameter of $\mathcal{M}$ in the previous definition).

The following theorem gives an expression of $\kappa(x,y)$. Recall that the generator of the diffusion is

$$L(f) = \frac{1}{2}A^{ij}\nabla_i\nabla_j f + F^i\nabla_i f$$

with $A$ symmetric and non-negative.

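Theorem 13 Take two distinct points $x$ and $y$ in $\mathcal{M}$, such that $A$ and $F$ are continuous at $x$ and $y$, and that the diffusion is locally uniformly $L^1$-bounded at $x$ and $y$. Assume that the distance between two points in the neighborhoods of $x$ and $y$ admits the following second-order Taylor expansion:

$$d(\exp_x(\varepsilon v),\exp_y(\varepsilon w)) = d(x,y)\left(1 + \varepsilon\left(l^{(1)}_i v^i + l^{(2)}_j w^j\right) + \frac{\varepsilon^2}{2}\left(q^{(1)}_{i_1i_2}v^{i_1}v^{i_2} + q^{(2)}_{j_1j_2}w^{j_1}w^{j_2} + 2q^{(12)}_{ij}v^iw^j\right) + o\left(\frac{\varepsilon^2}{|\ln\varepsilon|}\right)\right).$$

Then the coarse Ricci curvature between $x$ and $y$ is:

$$\kappa(x,y) = -l^{(1)}_iF^i(x) - l^{(2)}_jF^j(y) - \frac{1}{2}\left(q^{(1)}_{i_1i_2}A^{i_1i_2}(x) + q^{(2)}_{j_1j_2}A^{j_1j_2}(y)\right) + \mathrm{tr}\left(\sqrt{A^{i_1i_2}(x)\,q^{(12)}_{i_2j_1}A^{j_1j_2}(y)\,q^{(12)}_{i_3j_2}}\right).$$

Here the matrix $S^{i_1}{}_{i_3} = A^{i_1i_2}(x)q^{(12)}_{i_2j_1}A^{j_1j_2}(y)q^{(12)}_{i_3j_2}$ is diagonalizable with non-negative eigenvalues, since it is the product of two symmetric non-negative matrices, so $S^{i_1}{}_{i_2}$ admits a unique diagonalizable square root $R^{i_1}{}_{i_2}$ with non-negative eigenvalues, and the last term of the formula is simply $R^i{}_i$.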

Remark 14 We don't assume here that $d$ is the usual geodesic distance on the manifold $\mathcal{M}$, but only that it admits a nice second order Taylor expansion. For example, we can take for $d$ the Euclidean distance on the sphere $S^n$ embedded in $\mathbb{R}^{n+1}$.

Proof: The idea is to approximate the distributions $P^t_x$ and $P^t_y$ for small $t$ by Gaussian distributions in the tangent spaces $T_x\mathcal{M}$ and $T_y\mathcal{M}$, and to approximate the distance by its second order Taylor expansion. We can describe the process $x(t)$ starting at $x$ in the exponential map by the equation:

$$dX^i(t) = B^i{}_\alpha(X(t))\,dW^\alpha(t) + F'^i(X(t))\,dt$$


where $W(t)$ is a Brownian motion in $\mathbb{R}^n$, $B^{i_1}{}_{\alpha_1}(0)\,\delta^{\alpha_1\alpha_2}B^{i_2}{}_{\alpha_2}(0) = A^{i_1i_2}(x)$ and $F'^i(0) = F^i(x)$, and $B$ and $F'$ are continuous (because of the continuity of $A$ and $F$) and defined in a neighborhood of 0. Keep in mind that $X(t)$ may not be defined for every $t > 0$, but we have $x(t) = \exp_x(X(t))$ when it is. We will approximate $X^i(t)$ by

$$X^{(0)i}(t) = B^i{}_\alpha(0)W^\alpha(t) + tF^i(x),$$

which has the Gaussian law $\mathcal{N}(tF(x), tA(x))$. For small $t$, the ball $K_t$ of radius $\sqrt{(2A^{i_1i_2}(x)g_{i_1i_2}(x) + 2)\,t|\ln(t)|}$ of $T_x\mathcal{M}$ is included in the definition domain of $B$ and $F'$. We will show that $X(s)$ remains in $K_t$ for $0 \le s \le t$ with probability $1 - o(t)$. Let $T_t$ be the exit time of $X(s)$ from $K_t$, and



We want to prove that $\|X_t|_{[0,t]}\|_\infty = \sup_{s\in[0,t]}\|X_t(s)\| < \sqrt{(2A^{i_1i_2}(x)g_{i_1i_2}(x) + 2)\,t|\ln(t)|}$ with probability $1 - o(t)$.

We first prove that $\|(X_t - X^{(0)})|_{[0,t]}\|_\infty = o(\sqrt{t|\ln(t)|})$ with probability $1 - o(t)$. We have $d(X_t - X^{(0)})^i(s) = 1_{s<T_t}\left[(B^i{}_\alpha(X(s)) - B^i{}_\alpha(0))\,dW^\alpha(s) + (F'^i(X(s)) - F^i(x))\,ds\right]$. Because of the continuity of $B$ and $F'$, we have $d(X_t - X^{(0)})(s) = o(1)\,dW(s) + o(1)\,ds$. We have $\left\|\int_0^{\cdot} o(1)\,ds\,\big|_{[0,t]}\right\|_\infty = o(t) = o(\sqrt{t|\ln(t)|})$. For each coordinate of the martingale $U^i(s) = \int_0^s 1_{u<T_t}(B^i{}_\alpha(X(u)) - B^i{}_\alpha(0))\,dW^\alpha(u)$, we will apply the Doob inequality to the sub-martingale $e^{\lambda U^i}$. Indeed, we have $\frac{d}{ds}E[e^{\lambda U^i(s)}] \le \frac{c_1(t)}{2}\lambda^2E[e^{\lambda U^i(s)}]$, with $c_1(t) = o(1)$ depending on the infinite norm on $K_t$ of $(B^{i_1}{}_{\alpha_1}(X) - B^{i_1}{}_{\alpha_1}(0))\,\delta^{\alpha_1\alpha_2}(B^{i_2}{}_{\alpha_2}(X) - B^{i_2}{}_{\alpha_2}(0))$, so $E[e^{\lambda U^i(t)}] \le e^{\frac{c_1(t)}{2}\lambda^2t}$. So the Doob inequality implies, by taking $c_2(t) = \sqrt{3c_1(t)\,t|\ln(t)|} = o(\sqrt{t|\ln(t)|})$ and $\lambda = \pm\frac{c_2(t)}{c_1(t)t}$, that

$$P\left(\sup_{[0,t]}|U^i(s)| \ge c_2(t)\right) \le 2e^{-\frac{c_2(t)^2}{2c_1(t)t}} = 2e^{\frac{3}{2}\ln(t)} = 2t^{\frac{3}{2}} = o(t).$$

We deduce $\|U|_{[0,t]}\|_\infty \le o(\sqrt{t|\ln(t)|})$ with probability $1 - o(t)$, so we have the same conclusion for $\|(X_t - X^{(0)})|_{[0,t]}\|_\infty$.

We have $\|sF'^i(0)\|_\infty = O(t) = o(\sqrt{t|\ln(t)|})$, so it remains to prove that $\|B^i{}_\alpha(0)W^\alpha(s)|_{[0,t]}\|_\infty \le \sqrt{(2A^{i_1i_2}(x)g_{i_1i_2}(x) + 1)\,t|\ln(t)|}$ with probability $1 - o(t)$. We can suppose we are working with an orthonormal basis of eigenvectors of $A^{i_1i_2}(x)g_{i_2i_3}(x)$, with eigenvalues $\lambda_i$. In this case, using the same method as above, we prove that for each $i$, $\sup_{[0,t]}|B^i{}_\alpha(0)W^\alpha(s)| \le \sqrt{(2\lambda_i + \frac{1}{n})\,t|\ln(t)|}$ with probability $1 - o(t)$, and then if the inequality is true for all $i$, we get the announced result by summing the squares.
Now we set 



We have H[d(x(t),exp x (Xt(t)))] = o(t). Indeed, if X(t) does not exit from 
K t (T t > t), then the distance is 0. If T t < t (what we have shown to 
occur with probability o(t)), we apply the Markov property, and using the 
local uniform L 1 -boundedness assumption, the conditional expectation 
of d(x(t),x) knowing (X(T t ),T t ) is smaller than M for t small enough. 



X (0), (() = B\(0)W a (t) + tF\x), 
So $E[d(x(t), \exp_x(X_t(t)))] \le M\,o(t) = o(t)$. So the Wasserstein distance between the distributions of $x(t)$ and $\exp_x(X_t(t))$ is $o(t)$.

Of course, we can do the same for the process starting at y, and define 
r (0) , K' u Y t and Y t . 

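We denote by $\tilde d$ the second order Taylor expansion of the distance:

$$\tilde d(X,Y) = d(x,y)\left(1 + l^{(1)}_iX^i + l^{(2)}_jY^j + \frac{1}{2}\left(q^{(1)}_{i_1i_2}X^{i_1}X^{i_2} + q^{(2)}_{j_1j_2}Y^{j_1}Y^{j_2} + 2q^{(12)}_{ij}X^iY^j\right)\right).$$

The supremum of $|\tilde d(X,Y) - d(\exp_x(X),\exp_y(Y))|$ over $K_t \times K'_t$ is $o\left(\frac{t|\ln(t)|}{|\ln(\sqrt{t|\ln(t)|})|}\right) = o(t)$. So the Wasserstein distance between $\exp_x(X_t(t))$ and $\exp_y(Y_t(t))$ differs by $o(t)$ from the minimum over all couplings of $E[\tilde d(X_t(t), Y_t(t))]$.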

It remains to prove that we have a difference of o(t) between the solu- 
tions of the minimization problems of E[d(X t (t), Y t (t))] and E[d(X m (t), F (0) (i))]. 
The expectation and the covariance of X t (t) - X (0) (t) and Y t (t) - Y (0) (t) 
are o(t). Indeed, we have X t (t) - X (0 \t) = (X t (t) - X t (t)) + (X t (t) - 
X^°'(t)). The expectation and the covariance of X t {i) — X'°'(t) are o(t) 
because this quantity is f Q o(l)dW (s) + o(l)ds (the stochastic integral is 
a martingale, so its expectation at time t is 0). We have X t (i) — Xt{t) = 
when T t > t, which occurs with probability 1— o(t), and the conditional ex- 
pectation and covariance of Xt(t) — Xt(t) knowing Tt <t are 0(^/t\ \n(t)\) 
and 0(t|ln(t)|), so the expectation and covariance of Xt(t) — Xt(t) are 
o(t^/t\ ln(t) | ) and o(t 2 1 ln(t) | ) (so o(t) anyway). So the expectation and 
the covariance of X t (t) — X' ' (t) are o(t). 

The expectation and the covariance of X^(t) and Y^°\t) are O(t), 
so for any coupling of (X t (t), X m (t)) with (Y(T), Y (0) (t)), we have 



E[d(X t (t), Y t (i)) - d(X m (t), Y w (*))] 



d(x, y)E 



x^'(t)) + if\Yj>(t) - y(°»(t)) 

X (0)ll (t))(X t l2 (i) + X (0)l2 (i)) 



+<# 2) (*i(t) - i (D)i (t))(^(t) + y (0)J (t)) 
+ g 2) (X!(t) + x«*(t))(y/(t) - y(°W(t)) 



= <>(*)■ 



The last four terms above are $o(t)$ because of the Cauchy-Schwarz inequality, which implies that for every family of random vectors $V_t$ and $W_t$ whose covariance matrices satisfy $\mathrm{Cov}(V_t,V_t) = o(t)$ and $\mathrm{Cov}(W_t,W_t) = O(t)$, we have $\mathrm{Cov}(V_t,W_t) = o(t)$. So we have proved that $W_1(P^t_x,P^t_y) = \inf E[\tilde d(X^{(0)}(t),Y^{(0)}(t))] + o(t)$. The laws of $X^{(0)}(t)$ and $Y^{(0)}(t)$ are $\mathcal{N}(tF(x),tA(x))$ and $\mathcal{N}(tF(y),tA(y))$, so we have



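$$E[\tilde d(X^{(0)}(t),Y^{(0)}(t))] = d(x,y)\Big(1 + t\left(l^{(1)}_iF^i(x) + l^{(2)}_jF^j(y)\right) + \frac{t}{2}\left(q^{(1)}_{i_1i_2}A^{i_1i_2}(x) + q^{(2)}_{j_1j_2}A^{j_1j_2}(y)\right)$$

$$+ \frac{t^2}{2}\left(q^{(1)}_{i_1i_2}F^{i_1}(x)F^{i_2}(x) + q^{(2)}_{j_1j_2}F^{j_1}(y)F^{j_2}(y) + 2q^{(12)}_{ij}F^i(x)F^j(y)\right)$$

$$+ E\left[q^{(12)}_{ij}\left(X^{(0)i}(t) - tF^i(x)\right)\left(Y^{(0)j}(t) - tF^j(y)\right)\right]\Big).$$

We only have to minimize the last term, and the minimum is $-t\,\mathrm{tr}\left(\sqrt{A^{i_1i_2}(x)q^{(12)}_{i_2j_1}A^{j_1j_2}(y)q^{(12)}_{i_3j_2}}\right)$ according to Lemma 15 below.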
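So we have proved that

$$W_1(P^t_x,P^t_y) = d(x,y)\left(1 + t\left(l^{(1)}_iF^i(x) + l^{(2)}_jF^j(y) + \frac{1}{2}\left(q^{(1)}_{i_1i_2}A^{i_1i_2}(x) + q^{(2)}_{j_1j_2}A^{j_1j_2}(y)\right) - \mathrm{tr}\left(\sqrt{A^{i_1i_2}(x)q^{(12)}_{i_2j_1}A^{j_1j_2}(y)q^{(12)}_{i_3j_2}}\right)\right)\right) + o(t)$$

which precisely means that

$$\kappa(x,y) = -l^{(1)}_iF^i(x) - l^{(2)}_jF^j(y) - \frac{1}{2}\left(q^{(1)}_{i_1i_2}A^{i_1i_2}(x) + q^{(2)}_{j_1j_2}A^{j_1j_2}(y)\right) + \mathrm{tr}\left(\sqrt{A^{i_1i_2}(x)q^{(12)}_{i_2j_1}A^{j_1j_2}(y)q^{(12)}_{i_3j_2}}\right).$$

□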



Lemma 15 Let $A^{i_1i_2}$ and $B^{j_1j_2}$ be two symmetric non-negative tensors belonging to $E_1 \otimes E_1$ and $E_2 \otimes E_2$, with $E_1$ and $E_2$ two finite dimensional $\mathbb{R}$-vector spaces, not necessarily of the same dimension. Let $D_{ij}$ be a tensor belonging to $E_1^* \otimes E_2^*$. Then the minimum of $E[D_{ij}X^iY^j]$ over all couplings between $X$ of law $\mathcal{N}(0,A)$ and $Y$ of law $\mathcal{N}(0,B)$ is

$$-\mathrm{tr}\left(\sqrt{A^{i_1i_2}D_{i_2j_1}B^{j_1j_2}D_{i_3j_2}}\right).$$

Proof: The quantity to be minimized only depends on the covariance $C^{ij} = E[X^iY^j]$ between $X$ and $Y$ (this quantity is $C^{ij}D_{ij}$). So our problem is equivalent to minimizing $C^{ij}D_{ij}$ over the set of all possible $C$ such that there exists a coupling between $X$ and $Y$ such that the covariance between $X$ and $Y$ is $C$. Since $X$ and $Y$ are Gaussian, $C$ is the covariance of a coupling between $X$ and $Y$ if and only if

$$\begin{pmatrix} A & C \\ C^T & B \end{pmatrix}$$

is a symmetric non-negative matrix (because there exists a Gaussian coupling having this covariance). This condition is equivalent to $\forall (X^*,Y^*) \in E_1^* \times E_2^*$, $X^*_{i_1}A^{i_1i_2}X^*_{i_2} + Y^*_{j_1}B^{j_1j_2}Y^*_{j_2} + 2X^*_iC^{ij}Y^*_j \ge 0$, which is equivalent to $\forall (X^*,Y^*)$, $|X^*_iC^{ij}Y^*_j| \le \sqrt{X^*_{i_1}A^{i_1i_2}X^*_{i_2}\,Y^*_{j_1}B^{j_1j_2}Y^*_{j_2}}$. In particular, this implies $C \in \mathrm{Im}(A) \otimes \mathrm{Im}(B)$ (just take $X^* \in \mathrm{Ker}(A)$ or $Y^* \in \mathrm{Ker}(B)$ and remember $\mathrm{Im}(A^T) = (\mathrm{Ker}(A))^\perp$).

Let $n_1 = \mathrm{rk}(A)$ and $n_2 = \mathrm{rk}(B)$ be the ranks of $A$ and $B$. Using suitable bases of $\mathrm{Im}(A)$ and $\mathrm{Im}(B)$, we find "square roots" $A'^i{}_\alpha$ and $B'^j{}_\beta$ of $A$ and $B$, in the sense that $A'I^{(n_1)}A'^T = A$ and $B'I^{(n_2)}B'^T = B$ ($I_{(n)}$ is the scalar product on a canonical $n$-dimensional Euclidean space and $I^{(n)}$ the associated scalar product in the dual of this space). Then $A'$ and $B'$ admit left inverses $A'^{-1}$ and $B'^{-1}$ (here we don't necessarily have uniqueness, we just choose two left inverses). We set $C' = A'^{-1}CB'^{-1T}$. We have

$$\begin{pmatrix} A & C \\ C^T & B \end{pmatrix} \ge 0 \iff \begin{pmatrix} A'^{-1} & 0 \\ 0 & B'^{-1} \end{pmatrix}\begin{pmatrix} A & C \\ C^T & B \end{pmatrix}\begin{pmatrix} A'^{-1T} & 0 \\ 0 & B'^{-1T} \end{pmatrix} = \begin{pmatrix} I^{(n_1)} & C' \\ C'^T & I^{(n_2)} \end{pmatrix} \ge 0$$

(because $A'A'^{-1}$ restricted to $\mathrm{Im}(A)$ is the identity, and likewise for $B'B'^{-1}$). So we have reduced the problem to the case where $E'_1$ and $E'_2$ are Euclidean spaces of dimensions $n_1$ and $n_2$, with $A = I^{(n_1)}$, $B = I^{(n_2)}$ and
$D' = A'^TDB'$ instead of $D$. There exist two nice orthogonal bases so that the matrix of $D'$ in the associated dual bases has the following form:

$$D' = \begin{pmatrix} \mathrm{diag}(\lambda_1,\ldots,\lambda_r) & 0 \\ 0 & 0 \end{pmatrix}$$

with $\mathrm{diag}(\lambda_1,\ldots,\lambda_r)$ the diagonal matrix with coefficients $\lambda_1,\ldots,\lambda_r$, $\lambda_k > 0$ and $r = \mathrm{rk}(D') = \mathrm{rk}(ADB)$, and furthermore, the coefficients $\lambda_k$ are unique. This result can be proved thanks to the polar decomposition. We have

$$\begin{pmatrix} I^{(n_1)} & C' \\ C'^T & I^{(n_2)} \end{pmatrix} \ge 0 \iff \|C'\|_{op} \le 1$$

with $\|C'\|_{op}$ the operator norm of $C'$ associated with the Euclidean norms, hence the coefficients of $C'$ are greater than or equal to $-1$. The minimum of $C'^{\alpha\beta}D'_{\alpha\beta}$ is then $-\sum_{k=1}^r\lambda_k$, and is attained when the matrix of $C'$ in the nice bases is

$$C' = \begin{pmatrix} -I_r & 0 \\ 0 & C'' \end{pmatrix}$$

with $\|C''\|_{op} \le 1$, and only for those $C'$.

The endomorphism $I^{(n_1)}D'I^{(n_2)}D'^T$ has the eigenvalues $\lambda_k^2$ and 0 with multiplicity $n_1 - r$ (its matrix in the nice basis is $\mathrm{diag}(\lambda_1^2,\ldots,\lambda_r^2,0,\ldots,0)$). We have

$$I^{(n_1)}D'I^{(n_2)}D'^T = I^{(n_1)}A'^TDB'I^{(n_2)}B'^TD^TA' = I^{(n_1)}A'^TDBD^TA'.$$

For any two matrices $M$, $N$ of size $p \times q$ and $q \times p$, we have for every $n \in \mathbb{N}$, $\mathrm{tr}((MN)^n) = \mathrm{tr}((NM)^n)$, so $MN$ and $NM$ have the same eigenvalues with the same multiplicity, except for the eigenvalue 0, where the difference of the multiplicities is $|p - q|$. So the matrix

$$A'I^{(n_1)}A'^TDBD^T = ADBD^T$$

also has the eigenvalues $\lambda_k^2$ and 0 with some multiplicity. The $\lambda_k$ are then the non-zero eigenvalues of $\sqrt{ADBD^T}$. So the minimum we were looking for is $-\sum_{k=1}^r\lambda_k = -\mathrm{tr}(\sqrt{ADBD^T})$ ($= -\mathrm{tr}(\sqrt{BD^TAD})$, so the symmetry between $A$ and $B$ is respected, which was not straightforward by looking at the formula). □
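The statement of Lemma 15 lends itself to a quick numerical check. The sketch below is only an illustration under stated assumptions: it uses NumPy, takes the symmetric square roots of $A$ and $B$ as the "square roots" $A'$ and $B'$ (one admissible choice among others), and builds a candidate optimal cross-covariance from a singular value decomposition of $D' = A'^TDB'$ (used here in place of the polar decomposition). The function names are hypothetical.

```python
import numpy as np


def psd_sqrt(S):
    """Symmetric square root of a symmetric non-negative matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T


def lemma15_minimum(A, B, D):
    """Value -tr sqrt(A D B D^T), computed from eigenvalues.

    The smaller of the two products A D B D^T and B D^T A D is used; both
    have the same non-zero eigenvalues, as noted at the end of the proof.
    """
    S = A @ D @ B @ D.T if A.shape[0] <= B.shape[0] else B @ D.T @ A @ D
    eig = np.linalg.eigvals(S).real
    return -float(np.sum(np.sqrt(np.clip(eig, 0.0, None))))


def optimal_cross_covariance(A, B, D):
    """Cross-covariance C = A' C' B'^T with C' = -U V^T, where D' = U diag(s) V^T."""
    Ah, Bh = psd_sqrt(A), psd_sqrt(B)
    U, s, Vt = np.linalg.svd(Ah @ D @ Bh)
    r = len(s)
    return -Ah @ U[:, :r] @ Vt[:r, :] @ Bh
```

For a covariance `C` returned by `optimal_cross_covariance`, the cost `np.sum(C * D)` should agree with `lemma15_minimum(A, B, D)` up to rounding, and the block matrix `[[A, C], [C.T, B]]` should be non-negative, which is the admissibility condition used in the proof.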

The two following remarks provide a good understanding of what the set of the solutions of our minimization problem looks like.

Remark 16 The set of all possible covariances is convex and compact, and the quantity to minimize is linear, so the minimum is attained at an extremal point of this convex set. Suppose $n_1 \ge n_2$, then in the case of an extremal covariance, the coupling between $X$ and $Y$ has the form $Y^j = M^j{}_iX^i$. Indeed, $C \mapsto A'^{-1}CB'^{-1T}$ restricted to $\mathrm{Im}(A) \otimes \mathrm{Im}(B)$ is linear and bijective, so $C$ is an extremal covariance if and only if $C'$ is an extremal tensor of operator norm smaller than or equal to 1. We know that for any tensor $C'$ there exist two orthogonal bases in which the matrix of $C'$ can be written:

$$C' = \begin{pmatrix} \mathrm{diag}(\mu_1,\ldots,\mu_{n_2}) \\ 0 \end{pmatrix}$$

with $\mu_k \ge 0$. The operator norm of $C'$ is then $\max_{1 \le k \le n_2}|\mu_k|$.

So $C'$ is an extremal tensor of norm at most 1 if and only if $\mu_k = 1$ for every $k$.

Indeed, if at least one $\mu_k$ is strictly smaller than 1, $C'$ is a non-trivial convex combination of the tensors whose matrices in the same bases are

$$\begin{pmatrix} \mathrm{diag}(\varepsilon_1,\ldots,\varepsilon_{n_2}) \\ 0 \end{pmatrix}$$

with $\varepsilon_i = \pm 1$, and each of these tensors has operator norm 1. And conversely, if $\mu_k = 1$ for every $k$, then $C'$ is an extremal tensor of norm at most 1. Assume that $C' = tC^{(1)} + (1-t)C^{(2)}$ with $t \in\, ]0,1[$, and $C^{(1)}$ and $C^{(2)}$ have an operator norm smaller than or equal to 1. Then the matrices of $C^{(1)}$ and $C^{(2)}$ have coefficients smaller than or equal to 1, so their coefficients on the "diagonal" must be 1. The coefficients outside the "diagonal" are 0 because the sum of the squared coefficients on each row and each column is at most 1. So $C^{(1)} = C^{(2)} = C'$.

If $C$ is an extremal covariance, we have $C'^TI_{(n_1)}C' = I^{(n_2)}$ (just do the product of the matrices in the nice bases). We then set $M = C^TA'^{-1T}I_{(n_1)}A'^{-1} = B'C'^TI_{(n_1)}A'^{-1}$. The covariance of $Y - MX$ is

$$B - MC - C^TM^T + MAM^T = B - [B'C'^TI_{(n_1)}A'^{-1}][A'C'B'^T] - [B'C'^TA'^T][A'^{-1T}I_{(n_1)}C'B'^T] + [B'C'^TI_{(n_1)}A'^{-1}][A'I^{(n_1)}A'^T][A'^{-1T}I_{(n_1)}C'B'^T]$$

$$= B - B'C'^TI_{(n_1)}C'B'^T = 0.$$

So $Y = MX$ as previously said.

Remark 17 For any solution $C$ of our minimization problem, we have

$$CD^TC = A'C'B'^TD^TA'C'B'^T = A'C'D'^TC'B'^T = A'I^{(n_1)}D'I^{(n_2)}B'^T = A'I^{(n_1)}A'^TDB'I^{(n_2)}B'^T = ADB.$$

In particular, we have $(CD^T)^2 = ADBD^T$ and $(D^TC)^2 = D^TADB$. If we take $C_0$ the solution which corresponds to $C'' = 0$, $C_0$ is the unique solution with minimal rank (hence the optimal coupling with "the least correlation" between $X$ and $Y$). We have $\mathrm{rk}(C_0) = \mathrm{rk}(ADB) = \mathrm{rk}(ADBD^T)$, so $\mathrm{rk}(C_0D^T) \le \mathrm{rk}(ADBD^T)$, and furthermore $\mathrm{tr}(C_0D^T) = -\mathrm{tr}(\sqrt{ADBD^T})$, hence we have $C_0D^T = -\sqrt{ADBD^T}$, and in a similar way $D^TC_0 = -\sqrt{D^TADB}$. Since $C_0D^TC_0 = ADB$, we have $\mathrm{Im}(C_0) \supset \mathrm{Im}(ADB)$ and $\mathrm{Im}(C_0^T) \supset \mathrm{Im}(BD^TA)$, and we have in fact equalities because these matrices have the same rank. As $ADB$, $ADBD^T$ and $D^TADB$ have the same rank, there exist $E$ and $F$ (which will play the role of $D^{-1T}$) such that $ED^TADB = ADB = ADBD^TF$, and then $C_0$ is given by the formula

$$C_0 = -\sqrt{ADBD^T}\,F = -E\sqrt{D^TADB}.$$

For the other solutions, we have 

C-C = A>(1 
The condition $\|C''\|_{op} \le 1$ is equivalent to the positivity of

$$\begin{pmatrix} I_{n_1-r} & C'' \\ C''^T & I_{n_2-r} \end{pmatrix}$$

which is equivalent to the positivity of

$$\begin{pmatrix} A - C_0B^{-1}C_0^T & C - C_0 \\ (C - C_0)^T & B - C_0^TA^{-1}C_0 \end{pmatrix}$$

where $A^{-1} = A'^{-1T}I_{(n_1)}A'^{-1}$ and $B^{-1} = B'^{-1T}I_{(n_2)}B'^{-1}$. But we would find the same results for the products $C_0B^{-1}C_0^T$ and $C_0^TA^{-1}C_0$ by taking any $A^{-1}$ and $B^{-1}$ such that $AA^{-1}A = A$ and $BB^{-1}B = B$, so this does not depend on the choice of $A'$, $A'^{-1}$, $B'$ or $B'^{-1}$.

We can split $\mathrm{Im}(A)$ as the direct sum of $\mathrm{Im}(ADB)$ and the orthogonal (for the quadratic form induced by $A$ on $\mathrm{Im}(A)$) of this space (which can be written as $\mathrm{Im}(A) \cap \mathrm{Ker}(BD^T)$). The two matrices $C_0B^{-1}C_0^T$ and $A - C_0B^{-1}C_0^T$ correspond to the decomposition of $A$ on these two subspaces. A similar remark is valid for the matrices $C_0^TA^{-1}C_0$ and $B - C_0^TA^{-1}C_0$ with respect to the decomposition of $\mathrm{Im}(B)$ as the sum of $\mathrm{Im}(BD^TA)$ and its orthogonal. An optimal coupling is then any coupling of $X$ and $Y$ such that the covariance between the orthogonal projections (with respect to $A$ and $B$) of $X$ and $Y$ on $\mathrm{Im}(ADB)$ and $\mathrm{Im}(BD^TA)$ is $C_0$.



2.2 The limit of k(x, y) when x and y are close 

Let us look at what the formula given by Theorem 13 for $\kappa(x,y)$ becomes when we take $y = \exp_x(\delta u)$, $d$ the usual geodesic distance on Riemannian manifolds and when $\delta$ tends to 0. We have the following result that gives the second order Taylor expansion of the geodesic distance on Riemannian manifolds.

Lemma 18 Let $x \in \mathcal{M}$, $(u,v,w) \in (T_x\mathcal{M})^3$ such that $g_{ij}u^iu^j = 1$, $y = \exp_x(\delta u)$, $w' \in T_y\mathcal{M}$ obtained from $w$ by parallel transport along the geodesic $t \mapsto \exp_x(t\delta u)$. Then we have, for fixed small enough $\delta$, the following Taylor expansion in $\varepsilon$:

$$d(\exp_x(\varepsilon v),\exp_y(\varepsilon w')) = \delta\left(1 + \frac{\varepsilon}{\delta}u^ig_{ij}(w^j - v^j) + \frac{\varepsilon^2}{2\delta^2}\left(r^{(1)}_{ij}v^iv^j + r^{(2)}_{ij}w^iw^j + 2r^{(12)}_{ij}v^iw^j\right) + O(\varepsilon^3)\right)$$

with

$$r^{(1)}_{ij} = g_{ij} - g_{ik}u^ku^lg_{lj} - \frac{\delta^2}{3}R_{kilj}u^ku^l + o(\delta^2)$$
$$r^{(2)}_{ij} = g_{ij} - g_{ik}u^ku^lg_{lj} - \frac{\delta^2}{3}R_{kilj}u^ku^l + o(\delta^2)$$
$$r^{(12)}_{ij} = -g_{ij} + g_{ik}u^ku^lg_{lj} - \frac{\delta^2}{6}R_{kilj}u^ku^l + o(\delta^2),$$

where $R_{kilj}$ is the Riemann tensor of the manifold, and $r^{(1)}_{ij}u^iv^j = r^{(2)}_{ij}u^iv^j = r^{(12)}_{ij}u^iv^j = r^{(12)}_{ij}v^iu^j = 0$ (and not only $o(\delta^2)$).

Proof: We take $\delta$ small enough such that $(x,y)$ does not belong to the cut-locus. Then the Riemannian distance is smooth on a neighborhood of $(x,y)$.

For the term in $\varepsilon$, the well known fact that the sphere of center $x$ and radius $\delta$ is orthogonal at $y$ to the geodesic joining $x$ to $y$ gives us that the part of this term depending on $w$ is proportional to $g_{ij}u^iw^j$. A similar argument holds for the term in $\varepsilon$ depending on $v$. Taking $v$ and $w$ proportional to $u$ gives the two constants, so we have the term in $\varepsilon$.

For the term in $\varepsilon^2$, we only show that it only depends on the orthogonal projections of $v$ and $w$ on the orthogonal of $u$, the proof of the behaviour in $\delta$ being based on tedious calculations. We define $\Sigma_x$ as the image by the exponential map at $x$ of a small ball of the orthogonal of $u$, and $\Sigma_y$ as the image by the exponential map at $y$ of a small ball of the orthogonal of $u'$. For $\varepsilon$ small enough, the geodesic between $x_1 = \exp_x(\varepsilon v)$ and $y_1 = \exp_y(\varepsilon w')$ intersects $\Sigma_x$ and $\Sigma_y$ at $x_2 = \exp_x(\varepsilon v_1)$ and $y_2 = \exp_y(\varepsilon w'_1)$ (we may have to extend the geodesic by $O(\varepsilon)$ beyond $x_1$ and $y_1$). We have: $d(x_1,y_1) = \bar d(x_1,x_2) + d(x_2,y_2) + \bar d(y_2,y_1)$ with $\bar d(x_1,x_2) = -d(x_1,x_2)$ if we needed to extend the geodesic beyond $x_1$ and $d(x_1,x_2)$ otherwise, and the same for $\bar d(y_2,y_1)$. We also have $v_1 = v_2 + O(\varepsilon)$ and $w_1 = w_2 + O(\varepsilon)$, where $v_2 = v - \langle u,v\rangle u$, $w_2 = w - \langle u,w\rangle u$, and the $O(\varepsilon)$ are orthogonal to $u$. Since in the exponential map, the variation of the metric is of order 2, we have $d(x_1,x_2) = \varepsilon\|v - v_1\|(1 + O(\varepsilon^2))$, and $d(y_2,y_1) = \varepsilon\|w - w_1\|(1 + O(\varepsilon^2))$. So we get $\bar d(x_1,x_2) = -\varepsilon\langle u,v\rangle + O(\varepsilon^3)$ and $\bar d(y_2,y_1) = \varepsilon\langle u,w\rangle + O(\varepsilon^3)$, so we find the terms in $\varepsilon$ we expected, and no terms in $\varepsilon^2$. As $v_1$ and $w_1$ are orthogonal to $u$, we get $d(x_2,y_2) = d(\exp_x(\varepsilon v_2),\exp_y(\varepsilon w'_2)) + O(\varepsilon^3)$. So the $\varepsilon^2$ term only depends on $v_2$ and $w_2$ as wanted. □

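From Theorem 13 and Lemma 18 we get:

Theorem 19 Suppose we have a diffusion process on a manifold $\mathcal{M}$ such that $A$ and $F$ are $C^1$, $\mathrm{rk}(A) = n$ everywhere, and which is locally uniformly $L^1$-bounded. Then $\kappa(x,\exp_x(\delta u))$ converges to

$$\kappa(x,u) := -u^ig_{ij}u^k\nabla_kF^j + \frac{1}{2}R_{kilj}A^{ij}u^ku^l - \frac{1}{4}u^i\nabla_i\bar A^{\alpha\beta}\left(\left(\bar g^{-1} \otimes \bar A + \bar A \otimes \bar g^{-1}\right)^{-1}\right)_{\alpha\beta\gamma\delta}u^j\nabla_j\bar A^{\gamma\delta}$$

when $\delta$ tends to 0.

Here, for any $M \in T_x\mathcal{M} \otimes T_x\mathcal{M}$, we denote by $\bar M$ the canonical projection of $M$ to $(T_x\mathcal{M}/\mathrm{Vect}(u)) \otimes (T_x\mathcal{M}/\mathrm{Vect}(u))$, and the tensor

$$T_{ijkl} = \left(\left(\bar g^{-1} \otimes \bar A + \bar A \otimes \bar g^{-1}\right)^{-1}\right)_{ijkl}$$

is uniquely defined by the relationship:

$$T_{ijkl}\left(\bar g^{km}\bar A^{ln} + \bar A^{km}\bar g^{ln}\right) = \delta_i^m\delta_j^n.$$

The contraction $T_{ijkl}M^{kl}$ is the unique matrix $N_{ij}$ such that $\bar AN\bar g^{-1} + \bar g^{-1}N\bar A = M$.

Remark 20 In the special case $A^{ij} = g^{ij}$, we find the usual curvature of the Bakry-Emery theory:

$$\kappa(x,u) = -\langle u,\nabla_uF\rangle + \frac{1}{2}\mathrm{Ric}(u,u).$$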
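The special case of Remark 20 can be checked numerically by feeding the coefficients of Lemma 18 into the formula of Theorem 13. The sketch below makes several assumptions that are not in the article: it works in an orthonormal frame at $x$ (so $g$ is the identity matrix), takes the unit sphere of dimension $n$ (so that $R_{kilj}u^ku^l = g_{ij} - u_iu_j$ and $\mathrm{Ric}(u,u) = n - 1$), drops the $o(\delta^2)$ remainders of Lemma 18, and uses $A = g$, $F = 0$. The function names are hypothetical.

```python
import numpy as np


def trace_sqrt(S):
    """Trace of the square root of a matrix with real non-negative eigenvalues."""
    eig = np.linalg.eigvals(S).real
    return float(np.sum(np.sqrt(np.clip(eig, 0.0, None))))


def kappa_theorem13(l1, l2, q1, q2, q12, A_x, A_y, F_x, F_y):
    """Formula of Theorem 13, with tensors given as arrays in fixed frames."""
    drift = -l1 @ F_x - l2 @ F_y
    second_order = -0.5 * (np.sum(q1 * A_x) + np.sum(q2 * A_y))
    return float(drift + second_order + trace_sqrt(A_x @ q12 @ A_y @ q12.T))


def unit_sphere_coefficients(n, delta):
    """Coefficients l, q of Theorem 13 read off Lemma 18, remainders dropped.

    Assumes constant sectional curvature 1 and an orthonormal frame whose
    first vector is u; q = r / delta^2 and l = -+ u / delta.
    """
    u = np.zeros(n)
    u[0] = 1.0
    P = np.eye(n) - np.outer(u, u)           # g - g u u^T g
    r1 = P * (1.0 - delta**2 / 3.0)          # r^(1) = r^(2)
    r12 = -P * (1.0 + delta**2 / 6.0)        # r^(12)
    return -u / delta, u / delta, r1 / delta**2, r1 / delta**2, r12 / delta**2
```

With these truncated coefficients the terms of order $\delta^{-2}$ cancel and `kappa_theorem13` returns $(n-1)/2$, that is $\frac{1}{2}\mathrm{Ric}(u,u)$, in agreement with Remark 20.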
Proof: The hypothesis that $A$ and $F$ are $C^1$ gives us that the parallel transports of $A(y)$ and $F(y)$ along the geodesic are $A^{ij}(x) + \delta u^k\nabla_kA^{ij}(x) + o(\delta) = A^{ij}(x) + \delta E^{ij}(\delta)$, where $E^{ij}(\delta)$ tends to $u^k\nabla_kA^{ij}(x)$ when $\delta$ tends to 0, and $F^i(x) + \delta u^k\nabla_kF^i(x) + o(\delta)$. The application of Theorem 13 and Lemma 18 leads to

$$\kappa(x,\exp_x(\delta u)) = \frac{1}{\delta}u^ig_{ij}\left(F^j - (F^j + \delta u^k\nabla_kF^j + o(\delta))\right) - \left(\frac{A^{ij}}{\delta^2} + \frac{E^{ij}(\delta)}{2\delta}\right)\left(g_{ij} - g_{ik}u^ku^lg_{lj} - \frac{\delta^2}{3}R_{kilj}u^ku^l\right)$$

$$+ \frac{1}{\delta^2}\mathrm{tr}\left(\sqrt{A\,r^{(12)}(\delta)\,(A + \delta E(\delta))\,r^{(12)T}(\delta)}\right) + o(1).$$

The difficult point is to understand the behaviour of the square root when $\delta$ tends to 0. The quantity under the square root tends to $\left(A^{ik}(g_{kj} - g_{kl}u^lu^mg_{mj})\right)^2$, which is of rank $n - 1$ (its kernel is $\mathrm{Vect}(u)$). The square root of matrices is an analytic function in a neighborhood of matrices with positive eigenvalues. This is why we quotient the space $T_x\mathcal{M}$ by $\mathrm{Vect}(u)$ (thanks to Lemma 18 we know that $r^{(12)} \in u^\perp \otimes u^\perp$).

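We need the second-order Taylor expansion of $\mathrm{tr}(\sqrt{M^2 + \varepsilon N})$ with $M$ a diagonalizable matrix with positive eigenvalues. We have $\sqrt{M^2 + \varepsilon N} = M + \varepsilon H + \varepsilon^2K + O(\varepsilon^3)$, so we have $HM + MH = N$ and $H^2 + KM + MK = 0$. If we work in a diagonalization basis of $M$ (with $\lambda_{(i)}$ the eigenvalues of $M$), we get:

$$H^i{}_j = \frac{N^i{}_j}{\lambda_{(i)} + \lambda_{(j)}}$$

and

$$K^i{}_j = -\sum_k\frac{N^i{}_kN^k{}_j}{(\lambda_{(i)} + \lambda_{(k)})(\lambda_{(k)} + \lambda_{(j)})(\lambda_{(i)} + \lambda_{(j)})}.$$

So we have:

$$\mathrm{tr}(H) = \sum_i\frac{N^i{}_i}{2\lambda_{(i)}} = \frac{1}{2}\mathrm{tr}(M^{-1}N)$$

and

$$\mathrm{tr}(K) = -\sum_{i,j}\frac{N^i{}_jN^j{}_i}{2\lambda_{(i)}(\lambda_{(i)} + \lambda_{(j)})^2} = -\sum_{i,j}\frac{N^i{}_jN^j{}_i}{4\lambda_{(i)}\lambda_{(j)}(\lambda_{(i)} + \lambda_{(j)})} = -\frac{1}{4}\mathrm{tr}\left(M^{-1}N\left((I \otimes M + M \otimes I)^{-1}(M^{-1}N)\right)\right).$$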

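We only have to apply these results, on the quotient $T_x\mathcal{M}/\mathrm{Vect}(u)$, with $M = A(g - guu^Tg)$, $\varepsilon = \delta$ and $N = A(g - guu^Tg)E(\delta)(g - guu^Tg) + \frac{\delta}{6}\left(AR(u)A(g - guu^Tg) + A(g - guu^Tg)AR(u)\right) + o(\delta)$, where $R_{ij}(u) = R_{kilj}u^ku^l \in u^\perp \otimes u^\perp$. We obtain:

$$\mathrm{tr}\left(\sqrt{A\,r^{(12)}(\delta)\,(A + \delta E(\delta))\,r^{(12)T}(\delta)}\right) = \mathrm{tr}\left(\bar A(g - guu^Tg)\right) + \frac{\delta}{2}\mathrm{tr}\left(\bar E(\delta)(g - guu^Tg)\right) + \frac{\delta^2}{6}\mathrm{tr}\left(\bar A\bar R(u)\right)$$

$$- \frac{\delta^2}{4}\mathrm{tr}\left(\nabla_u\bar A(g - guu^Tg)\left((I \otimes M + M \otimes I)^{-1}(\nabla_u\bar A(g - guu^Tg))\right)\right) + o(\delta^2).$$
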


We have tr(AR(tt)) = tr(Ai?(w)), and the last term can be written 
— 4f tr(V u ^l((^4®<7 _1 + <7 _1 <8>j4) _1 V tl ^4)) because the inverse of g — guuFg 
(acting on T X A4 /Vect(u)) is g^ 1 . Replacing this expression of the trace 
of the square root in the expression of k(x, exp x (Su)) cancels the terms of 
order and |, and we get the announced result. □ 
Remark 21 The dependency on $u$ of the last term of the formula for the curvature is generally not quadratic (because of the complicated dependency on $u$ of the tensor $(A \otimes g^{-1} + g^{-1} \otimes A)^{-1}$), but it is always non-positive and greater than or equal to the same expression without the bars (which we would have obtained by using the $W_2$ distance instead of the $W_1$ in the definition of $\kappa$, and this expression without the bars depends on $u$ in a quadratic way).



2.3 Construction of the coupling 

Now we will construct a coupling between the paths of the diffusion process thanks to the optimal coupling in the tangent spaces. In the case when $A$ is invertible everywhere on $M$, we have $\mathrm{rk}(A(x)q^{(12)}(x,y)A(y)) = n-1$. According to Remarks [16] and [17], we have two extremal covariances $C^+(x,y)$ and $C^-(x,y)$ in the set of the covariances of optimal couplings, given by the formulas:

$$C^+(x,y) = -\sqrt{A(x)q^{(12)}(x,y)A(y)q^{(12)T}(x,y)}\,p(x,y) + \frac{uu'^T}{\sqrt{u^TA(x)^{-1}u\;u'^TA(y)^{-1}u'}}$$

$$C^-(x,y) = -\sqrt{A(x)q^{(12)}(x,y)A(y)q^{(12)T}(x,y)}\,p(x,y) - \frac{uu'^T}{\sqrt{u^TA(x)^{-1}u\;u'^TA(y)^{-1}u'}}$$

with $y = \exp_x(\delta u)$ with $\delta$ small enough, $u'$ the parallel transport of $u$ and $p(x,y)$ any matrix such that $A(x)q^{(12)}(x,y)A(y)q^{(12)T}(x,y)p(x,y) = A(x)q^{(12)}(x,y)A(y)$. The extremal covariance $C^+(x,y)$ tends to $A(x)$ when $y$ tends to $x$, whereas $C^-(x,y)$ tends to $A(x) - 2\frac{uu^T}{u^TA(x)^{-1}u}$ when $u$ stays fixed and $\delta$ tends to 0, so the coupling with $C^+(x,y)$ generalizes the coupling by parallel transport, whereas the one with $C^-(x,y)$ generalizes the coupling by reflection introduced by Kendall in [9]. Here we will use $C^+$ to construct our coupling for Theorem [6] because the behaviour of $C^-$ when $\delta$ tends to 0 is irregular.

So we can construct a coupling between the paths as a diffusion process on $M \times M$ (at least in the neighborhood of the diagonal), whose generator is defined by:

$$L^+(f)(x,y) = \frac{1}{2}\Big[A(x)^{ij}\nabla^2_{(11)ij}f(x,y) + A(y)^{ij}\nabla^2_{(22)ij}f(x,y) + 2C^{+ij}(x,y)\nabla^2_{(12)ij}f(x,y)\Big] + F^i(x)\nabla_{(1)i}f(x,y) + F^i(y)\nabla_{(2)i}f(x,y).$$

The coupling above in the case $A = g^{-1}$ is the one of Theorem [6].

Proof of Theorem [6]: Let us consider the diffusion process of infinitesimal generator $L^+$, which is well defined outside the cut-locus of $M$. In the special case of compact Riemannian manifolds, this is true when $d(x,y)$ is strictly smaller than the injectivity radius. To get the infinitesimal variation of $d(x(t),y(t))$, we have to compute $L^+(f)$ where $f$ has the special form $f(x,y) = \varphi(d(x,y))$ with $\varphi$ regular enough ($C^2$). We have:

$$\nabla_{(1)i}f(x,y) = \varphi'(d(x,y))\nabla_{(1)i}d(x,y)$$
$$\nabla_{(2)i}f(x,y) = \varphi'(d(x,y))\nabla_{(2)i}d(x,y)$$
$$\nabla^2_{(11)ij}f(x,y) = \varphi'(d(x,y))\nabla^2_{(11)ij}d(x,y) + \varphi''(d(x,y))\nabla_{(1)i}d(x,y)\nabla_{(1)j}d(x,y)$$
$$\nabla^2_{(12)ij}f(x,y) = \varphi'(d(x,y))\nabla^2_{(12)ij}d(x,y) + \varphi''(d(x,y))\nabla_{(1)i}d(x,y)\nabla_{(2)j}d(x,y)$$
$$\nabla^2_{(22)ij}f(x,y) = \varphi'(d(x,y))\nabla^2_{(22)ij}d(x,y) + \varphi''(d(x,y))\nabla_{(2)i}d(x,y)\nabla_{(2)j}d(x,y)$$

with, according to Lemma [18]:

$$\nabla_{(1)i}d(x,y) = -g_{ij}(x)u^j(x,y)$$
$$\nabla_{(2)i}d(x,y) = -g_{ij}(y)u^j(y,x) = g_{ij}(y)u'^j(x,y)$$
$$\nabla^2_{(11)ij}d(x,y) = d(x,y)q^{(11)}_{ij}(x,y)$$
$$\nabla^2_{(12)ij}d(x,y) = d(x,y)q^{(12)}_{ij}(x,y)$$
$$\nabla^2_{(22)ij}d(x,y) = d(x,y)q^{(22)}_{ij}(x,y).$$



Thus we get:

$$L^+(\varphi(d))(x,y) = -d(x,y)\varphi'(d(x,y))\kappa(x,y) + \frac{1}{2}\varphi''(d(x,y))\Big[A^{ij}(x)g_{ik}(x)u^k(x,y)g_{jl}(x)u^l(x,y) + A^{ij}(y)g_{ik}(y)u^k(y,x)g_{jl}(y)u^l(y,x) + 2C^{+ij}(x,y)g_{ik}(x)u^k(x,y)g_{jl}(y)u^l(y,x)\Big].$$



Since we have $A = g^{-1}$, we get $C^+(x,y) = C_0(x,y) - u(x,y)u^T(y,x)$, with $C_0 \in g^{-1}(x)u(x,y)^\perp \otimes g^{-1}(y)u(y,x)^\perp$. So the term containing $\varphi''(d(x,y))$ is 0, which means that the variance of $d(x(t),y(t))$ is $o(t)$ when $t$ tends to 0. So $\mathrm{d}d(x(t),y(t)) = -d(x(t),y(t))\kappa(x(t),y(t))\mathrm{d}t$. Then by integration of this equality, we get:

$$d(x(t),y(t)) = d(x(0),y(0))\,e^{-\int_0^t \kappa(x(s),y(s))\,\mathrm{d}s}. \qquad \Box$$
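The integration step at the end of the proof (from $\mathrm{d}d = -d\,\kappa\,\mathrm{d}t$ to the exponential formula) can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration: the function `kappa` is a hypothetical curvature profile along the coupled paths, written as a function of time only, and the explicit Euler scheme is just one simple way to integrate the equation.

```python
import math

def kappa(t):
    # Hypothetical curvature seen along the coupled paths (illustrative choice only).
    return 1.0 + 0.5 * math.sin(t)

def integrated_kappa(t):
    # Closed form of the integral of kappa over [0, t] for the choice above.
    return t + 0.5 * (1.0 - math.cos(t))

def distance_closed_form(d0, t):
    # d(t) = d(0) * exp(-integral_0^t kappa(s) ds)
    return d0 * math.exp(-integrated_kappa(t))

def distance_euler(d0, t, steps=20000):
    # Explicit Euler integration of d' = -kappa(t) * d.
    dt = t / steps
    d = d0
    for k in range(steps):
        d -= d * kappa(k * dt) * dt
    return d
```

With a positive `kappa`, both functions return a distance smaller than `d0`, and the Euler value approaches the closed form as `steps` grows.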



2.4 The (H) condition and the curvature R 

The variance term of this optimal coupling is generally not 0 in the case when $A \neq g^{-1}$ (nor a multiple of $g^{-1}$). So we can try to use another coupling, by replacing $C^+(x,y)$ with $C(x,y)$, which is the optimal covariance (for the distance) under the set of covariances which cancel the variance term of $d$ (if this set is non-empty).

We will prove this set is non-empty if and only if the condition

$$(H) \Leftrightarrow \forall u \in TM,\; u^i g_{jk}u^j g_{lm}u^l \nabla_i A^{km} = 0$$

is satisfied.

Indeed, the variance term is always nonnegative, so it may vanish if and only if its minimum is 0. This is equivalent, according to Lemma [15], to

$$2\,\mathrm{tr}\Big(\sqrt{A(x)g(x)u(x,y)u^T(y,x)g(y)A(y)g(y)u(y,x)u^T(x,y)g(x)}\Big) = u^T(x,y)g(x)A(x)g(x)u(x,y) + u^T(y,x)g(y)A(y)g(y)u(y,x),$$

which is equivalent to

$$u^T(x,y)g(x)A(x)g(x)u(x,y) = u^T(y,x)g(y)A(y)g(y)u(y,x)$$

(this is the equality case in the inequality between arithmetic and geometric mean). Differentiating this condition with respect to $y$ along the geodesic starting at $x$ in the direction $u$ gives the condition (H), and of course the converse implication is obtained by integration.

The hypothesis (H) is a very strong hypothesis: for a given metric, the set of the possible $A$ which are nonnegative and satisfy (H) is a convex cone of finite dimension. Indeed, (H) is equivalent to: for every geodesic $\gamma(t)$, $A(\gamma(t))(g^{-1}\gamma'(t))^{\otimes 2}$ is constant. We choose $x \in M$, and we take a family of vectors $u_{(k)}$, $k = 1, \ldots, \frac{n(n+1)}{2}$, such that $\{u_{(k)}^{\otimes 2}\}$ is a basis of the symmetric tensors of $T_xM^{\otimes 2}$. Then we take $x_{(k)} = \exp_x(\varepsilon u_{(k)})$, with $\varepsilon$ small enough to have $\|\varepsilon u_{(k)}\| < r$, with $r$ the injectivity radius of $M$. Then there exists a ball $B$ centered at $x$ such that for every $y \in B$ and every $k$, there exists a unique minimal geodesic joining $y$ and $x_{(k)}$, with velocity $v_{(k)}$ at $y$, and $\{v_{(k)}^{\otimes 2}\}$ is a basis of the symmetric tensors of $T_yM^{\otimes 2}$. The knowledge of $A$ at the points $x_{(k)}$ is sufficient to uniquely determine $A$ on the ball $B$. For any $z \in M$, we have $x = \exp_z(v)$ for some $v \in T_zM$. We can find a family of vectors $v_{(k)} \in T_zM$ in a neighborhood of $v$ such that $\{v_{(k)}^{\otimes 2}\}$ is a basis of the symmetric tensors of $T_zM^{\otimes 2}$, and $\exp_z(v_{(k)}) \in B$. Then the knowledge of $A$ on the points $x_{(k)}$ uniquely determines $A$ on $M$.

This argument also shows that $A$ is smooth, and the second order Taylor expansion of $A$ in the neighborhood of a single point is sufficient to determine $A$ on the whole manifold. The condition (H), and the equations obtained by differentiating it twice, show that this Taylor expansion must belong to a subspace of dimension $\frac{n(n+1)^2(n+2)}{12}$.

The following examples give the set of the possible $A$ in the cases when $M$ is a Euclidean space of dimension $n$, the sphere of dimension $n$ or the hyperbolic space of dimension $n$, providing examples where (H) is satisfied without having $A^{ij} = g^{ij}$.

Example 22 In all three cases mentioned below, $M$ can be considered as a submanifold of $E = \mathbb{R}^{n+1}$ such that the geodesics are the intersection of $M$ and a two dimensional vector subspace of $E$. Let $(e_1, \ldots, e_{n+1})$ be the canonical basis of $E$ and $(e_1^*, \ldots, e_{n+1}^*)$ be the corresponding dual basis.

• We take $M$ equal to the affine hyperplane of equation $e_{n+1}^*(x) = 1$, equipped with the Euclidean metric $\sum_{i=1}^{n} e_i^{*2}$ in the first case.

• We put the scalar product $s = \sum_{i=1}^{n+1} e_i^{*2}$ on $E$, and we take $M$ equal to the sphere $s(x,x) = 1$, equipped with the metric induced by $s$ in the second case.

• We put the quadratic form $q = \sum_{i=1}^{n} e_i^{*2} - e_{n+1}^{*2}$ on $E$, and we take $M = \{x \mid q(x,x) = -1 \text{ and } e_{n+1}^*(x) > 0\}$, equipped with the metric induced by $q$ in the third case.

Then we take $T \in (E^*)^{\otimes 4}$ any tensor with the same symmetry as a Riemann tensor, that is, $T$ must satisfy $T_{ijkl} = -T_{jikl} = -T_{ijlk} = T_{klij}$ and the Bianchi identity $T_{ijkl} + T_{jkil} + T_{kijl} = 0$. We construct the tensor field $A$ on $M$ in the following way: let $(x,v) \in TM$, we want to have

$$A(x)(g^{-1}v)^{\otimes 2} = T_{ijkl}x^iv^jx^kv^l$$

where the sense of the right hand side is given by considering x and v 
as elements of E. The quadratic dependency in v is trivial, so A is well 
defined by the previous equation. Let us consider a unit speed geodesic on 
M, joining two distinct points x and y, and v and w be the speed vectors 
of the geodesic at points x and y. As said above, the geodesic is included 
in a two dimensional subspace of E, so (x,v) and (y,w) are two bases of 
this subspace. Thus there exists a matrix 

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

such that $y = ax + bv$ and $w = cx + dv$. Then we have $T(y,w,y,w) = \det(M)^2\,T(x,v,x,v)$ (that is a classical property of the Riemann tensor). If $l$ is the length of the geodesic, we have:

$$M = \begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix} \text{ in the case of the Euclidean space,}$$

$$M = \begin{pmatrix} \cos(l) & \sin(l) \\ -\sin(l) & \cos(l) \end{pmatrix} \text{ in the case of the sphere,}$$

$$M = \begin{pmatrix} \mathrm{ch}(l) & \mathrm{sh}(l) \\ \mathrm{sh}(l) & \mathrm{ch}(l) \end{pmatrix} \text{ in the case of the hyperbolic space.}$$

In each of the three cases, we have $\det(M) = 1$. Thus we have $A(x)(g^{-1}v)^{\otimes 2} = A(y)(g^{-1}w)^{\otimes 2}$ as wanted.
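The invariance $T(y,w,y,w) = \det(M)^2\,T(x,v,x,v)$ with $\det(M) = 1$ can be checked numerically on a toy example. The sketch below rests on assumptions that are not in the text: it uses the particular curvature-like tensor $T(a,b,c,d) = S(a,c)S(b,d) - S(a,d)S(b,c)$ built from a symmetric matrix `S_FORM` (it has the Riemann symmetries and satisfies the Bianchi identity), and arbitrary sample vectors `X0`, `V0` in $\mathbb{R}^3$. Since the identity is purely algebraic, the vectors are not required to lie on $M$.

```python
import math

def bilinear(s, a, b):
    """Evaluate the bilinear form with matrix s on the vectors a and b."""
    return sum(s[i][j] * a[i] * b[j] for i in range(len(a)) for j in range(len(b)))

def curvature_like(s, a, b, c, d):
    """T(a,b,c,d) = S(a,c)S(b,d) - S(a,d)S(b,c), a tensor with Riemann symmetries."""
    return bilinear(s, a, c) * bilinear(s, b, d) - bilinear(s, a, d) * bilinear(s, b, c)

def transfer_matrix(case, length):
    """Matrix ((a, b), (c, d)) with y = a x + b v and w = c x + d v."""
    if case == "euclidean":
        return ((1.0, length), (0.0, 1.0))
    if case == "sphere":
        return ((math.cos(length), math.sin(length)),
                (-math.sin(length), math.cos(length)))
    if case == "hyperbolic":
        return ((math.cosh(length), math.sinh(length)),
                (math.sinh(length), math.cosh(length)))
    raise ValueError("unknown case: " + case)

def transported_ratio(case, length, s, x, v):
    """Ratio T(y,w,y,w) / T(x,v,x,v), expected to equal det(M)^2 = 1."""
    (a, b), (c, d) = transfer_matrix(case, length)
    y = [a * xi + b * vi for xi, vi in zip(x, v)]
    w = [c * xi + d * vi for xi, vi in zip(x, v)]
    return curvature_like(s, y, w, y, w) / curvature_like(s, x, v, x, v)

# Hypothetical sample data (positive definite S, linearly independent x and v).
S_FORM = [[2.0, 1.0, 0.0], [1.0, 3.0, 0.5], [0.0, 0.5, 1.0]]
X0 = [0.3, -0.2, 1.0]
V0 = [1.0, 0.5, 0.0]
```

For each of the three cases, `transported_ratio(case, 0.7, S_FORM, X0, V0)` should return 1 up to rounding, in agreement with $\det(M) = 1$.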

The linear map $T \mapsto A$ is injective, so the dimension of its image is $\frac{n(n+1)^2(n+2)}{12}$, which is the maximal dimension of the vector space of symmetric tensor fields on $M$ satisfying the hypothesis (H). Thus this image is exactly this vector space. But the tensor fields $A$ which interest us are nonnegative on $M$, and this implies some restrictions on $T$. In the cases of the Euclidean space and the sphere, it is true if and only if the "sectional curvature" associated to $T$ is nonnegative, whereas in the case of the hyperbolic space, it is true if and only if this "sectional curvature" is nonnegative on the planes whose intersection with the cone $q(x,x) = 0$ is not $\{0\}$.

In the case when (H) is satisfied, the covariances which cancel the variance term of $d$ take the form $C(x,y) = C_0(x,y) + C'(x,y)$, with

$$C_0(x,y) = -\frac{A(x)g(x)u(x,y)u^T(y,x)g(y)A(y)}{u^T(x,y)g(x)A(x)g(x)u(x,y)} = -\frac{A(x)g(x)u(x,y)u^T(y,x)g(y)A(y)}{u^T(y,x)g(y)A(y)g(y)u(y,x)}$$

(the minus sign is the one required for the variance term $u^TgAgu(x) + u^TgAgu(y) + 2u^T(x,y)g(x)C(x,y)g(y)u(y,x)$ to vanish), and $C'(x,y)$ is such that the big matrix

$$\begin{pmatrix} A'(x,y) & C'(x,y) \\ C'^T(x,y) & A'(y,x) \end{pmatrix}$$

is nonnegative, with

$$A'(x,y) = A(x) - \frac{A(x)g(x)u(x,y)u^T(x,y)g(x)A(x)}{u^T(x,y)g(x)A(x)g(x)u(x,y)}.$$

Using Lemma [TS] again gives us the following expression of k(x,y): 

F{y)g(y)u(y,x) - F{x)g (x)u( x y) - \{tr{A(x)q ( - 1) {x, y)) + tr(4(y)g (2) (x, y))) 
tr(C (x, y)q {12)T {x, y)) + ti(y/A'{x, y)qM (x, y)A'(y, x)q^(x, y)) 



We can define $\kappa(x,u)$ as the limit when $\delta$ tends to 0 of $\kappa(x,\exp_x(\delta u))$. Then we have:

„"i _ „ fc V, E j 4- ±A ij R-, -,ll k ll l - ^9 t] u k \J k A' l g lm u n V n A ma g op 

-\B*{A' ® (g- 1 - uu T ) + (g- 1 - uu T ) ® AT k ^B kl 

with

$$A' = A - \frac{Aguu^TgA}{u^TgAgu},$$

$$B = \nabla_uA - \frac{\nabla_uA\,guu^TgA + Aguu^Tg\,\nabla_uA}{u^TgAgu},$$

and as $A'$, $B$ and $g^{-1} - uu^T$ belong to $g^{-1}u^\perp \otimes g^{-1}u^\perp$, we take $(A' \otimes (g^{-1} - uu^T) + (g^{-1} - uu^T) \otimes A')^{-1}$ the unique inverse of $A' \otimes (g^{-1} - uu^T) + (g^{-1} - uu^T) \otimes A'$ in $(T_x^*M/\mathrm{Vect}(gu))^{\otimes 4}$.

And we have the equivalent of Theorem [6]:
Lemma 23 If the hypothesis (H) is satisfied, then there exists a coupling between paths such that

$$d(X(t),Y(t)) = d(X(0),Y(0))\,e^{-\int_0^t \kappa(X(s),Y(s))\,\mathrm{d}s}$$

almost surely on the event that for every $0 \leq s \leq t$, $d(x',y')^2$ is smooth in a neighborhood of $(X(s),Y(s))$.

3 New bounds for the spectral gap 

The idea to prove Theorems [4] and [8] is to look at the exponential decay of the Lipschitz norm of $P^tf$ when $f$ is Lipschitz with mean 0 (with respect to the reversible probability measure). Then we use the reversibility assumption to conclude that the variance of $P^tf$ also decreases exponentially fast with the same rate, which is hence a lower bound for the spectral gap.
Proof of Theorem [8]: Let $x$ and $y$ be two points of $M$ such that $d(x,y) < r_i$ where $r_i$ is the injectivity radius of $M$. We have $P^tf(y) - P^tf(x) = E[f(Y(t)) - f(X(t))]$ for any coupling between paths. If $f$ is 1-Lipschitz, then $|f(Y(t)) - f(X(t))| \leq d(Y(t),X(t))$, so Lemma [23] tells us
that \P t f(y) - P t f{x)\ < d(x,y)E[e~ti s ( x ( s ). y (=)) ds ]. F or any e > n > 0, 
there exists 8 > such that for all (x',y') such that d(x',y') < 8, we 
have k(x',y') > k(x') — n, where k(x') = inf„ e T ,Mk(x',u). Taking 

T = we have d(X(T),Y(T)) < 8, and then for t > T we have 

Eje-Jo^PWW)^] < ± E[e ~^- T ^x {8)) ^ ) ^ Following what was 
done in |], we use the Feynman-Kac semigroup F* generated by K, with 

Kf(x) = \a^(x)V%}{x) + F^x^ifix) - (k(x) - r,)f(x). 

Indeed we have E[e" Jo ( s ( x ( s »-") ds ] = F f l(x). The Lipschitz norm of 
P*f is at most sup xeM ^-E[e" s(x(s),y(s))dsj^ for eyery f > T TMg 
quantity is sup xeM — P TF t ~ T l(y), so it is smaller than or equals to 

8 u d(8 x .P T ) u ,,,-t-Tni 

Then the Lipschitz norm of P f decreases exponentially fast with a better 
rate than the one of F t l. 

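The $L^2$-norm of $F^t1$ decreases exponentially with rate

$$\inf_{h \mid \int h^2\mathrm{d}\pi = 1} \int -hKh\,\mathrm{d}\pi \geq \inf_{h \mid \int h^2\mathrm{d}\pi = 1} \lambda_1\mathrm{Var}_\pi(h) + \int(\kappa(x) - \eta)h(x)^2\,\mathrm{d}\pi(x) = \lambda_1 + \inf_{h \mid \int h^2\mathrm{d}\pi = 1} \int(\kappa(x) - \eta)h(x)^2\,\mathrm{d}\pi(x) - \lambda_1\Big(\int h\,\mathrm{d}\pi\Big)^2.$$

The method of Lagrange multipliers suggests to take

$$h(x) = \frac{c}{\kappa(x) - \eta + \alpha}$$

with $\alpha$ such that

$$\frac{1}{\lambda_1} = \int \frac{\mathrm{d}\pi(x)}{\kappa(x) - \eta + \alpha}$$

and $c$ the constant for which $\int h^2\,\mathrm{d}\pi = 1$. With this $h$, we have

$$\int(\kappa(x) - \eta)h(x)^2\,\mathrm{d}\pi(x) - \lambda_1\Big(\int h\,\mathrm{d}\pi\Big)^2 = -\alpha.$$

This is indeed the minimal $h$ when $\lambda_1$ is at least the harmonic mean $\Lambda$ of $\kappa - \inf(\kappa)$. We can see it by using Cauchy-Schwarz:

$$\int(\kappa(x) - \eta)h(x)^2\,\mathrm{d}\pi(x) - \lambda_1\Big(\int h\,\mathrm{d}\pi\Big)^2 \geq \int(\kappa(x) - \eta)h(x)^2\,\mathrm{d}\pi(x) - \lambda_1\Big(\int(\kappa(x) - \eta + \alpha)h^2\,\mathrm{d}\pi(x)\Big)\Big(\int\frac{\mathrm{d}\pi(x)}{\kappa(x) - \eta + \alpha}\Big) = -\alpha.$$

In the case where $\lambda_1 < \Lambda$, we take $\alpha = \eta - \inf(\kappa)$. This time we get

$$\int(\kappa(x) - \eta)h(x)^2\,\mathrm{d}\pi(x) - \lambda_1\Big(\int h\,\mathrm{d}\pi\Big)^2 \geq \int(\kappa(x) - \eta)h(x)^2\,\mathrm{d}\pi(x) - \Lambda\Big(\int(\kappa(x) - \eta + \alpha)h^2\,\mathrm{d}\pi(x)\Big)\Big(\int\frac{\mathrm{d}\pi(x)}{\kappa(x) - \eta + \alpha}\Big) + (\Lambda - \lambda_1)\Big(\int h\,\mathrm{d}\pi\Big)^2 \geq -\alpha.$$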

A minimizing sequence $h_i(x)$ can be, for example, a sequence such that $h_i^2$ tends to a Dirac at a point where the minimum of $\kappa$ is reached.

In both cases, the exponential decay rate for zero-mean Lipschitz functions is at least $\lambda_1 - \alpha$, then by density of the Lipschitz functions in $L^2(\pi)$ and by the reversibility assumption, the exponential decay rate for zero-mean $L^2(\pi)$ functions (which is equal to $\lambda_1$) is also at least $\lambda_1 - \alpha$. Thus $\alpha$ is nonnegative, which means that $\lambda_1$ is at least the harmonic mean of $\kappa - \eta$, so letting $\eta$ tend to 0 yields the result. □
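The harmonic mean appearing in the bound, and the Cauchy-Schwarz step $(\int h\,\mathrm{d}\pi)^2 \leq \int\frac{\mathrm{d}\pi}{\kappa}\int\kappa h^2\,\mathrm{d}\pi$ used to reach it, can be illustrated on a discrete toy measure. This is a sketch under assumptions that are not in the paper: the measure $\pi$ is replaced by finitely many weights `WEIGHTS` summing to 1, the curvature by positive sample values `KAPPA`, and all names are hypothetical.

```python
def harmonic_mean(kappa, weights):
    """Harmonic mean of kappa with respect to a discrete probability measure."""
    return 1.0 / sum(w / k for k, w in zip(kappa, weights))

def cauchy_schwarz_sides(kappa, weights, h):
    """Return ((int h dpi)^2, (int dpi/kappa) * (int kappa h^2 dpi))."""
    lhs = sum(w * hi for w, hi in zip(weights, h)) ** 2
    inv_mean = sum(w / k for k, w in zip(kappa, weights))
    weighted = sum(w * k * hi * hi for k, w, hi in zip(kappa, weights, h))
    return lhs, inv_mean * weighted

# Hypothetical sample data: three curvature values and their probabilities.
KAPPA = [1.0, 2.0, 4.0]
WEIGHTS = [0.5, 0.25, 0.25]
```

On this data the harmonic mean is $16/11$, which lies above the infimum $1$ of the curvature, and the Cauchy-Schwarz inequality becomes an equality for `h` proportional to `1 / kappa`, matching the optimizer $h = c/(\kappa - \eta + \alpha)$ found by the Lagrange multiplier argument.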

Purely analytical methods can also be used to prove this result in the case $A^{ij} = g^{ij}$, and they also work when $\inf_{x \in M}\kappa(x) = 0$.

Lemma 24 Let f be a regular enough (C 3 ) function from M to R. Then 
we have 



d_ 



||VP*/f = h(2L(h)+u h g kl ViA ij V i hu j )+h 2 



+A^R Uja g afi u fi g kl u k + 2u k g kl ViF i u i 



where $h = \|\nabla f\|$ and $\nabla_kf = hu_k$.

Proof: We have ±\ t=Q HVP'/f = 2V k fg kl V 1 (Lf), and 

V((L/) = i(V i A I3 V 2 3 / + ^Vf lJ /)+V ! F l V I / + F I V 2 I / 

= \{ViA^V 2 3 f + A«(V?-,/ + Ru ja g^Vpf)) + VjJ^Vi/ + F'V 2 ,/. 

Differentiating $\nabla_if = hu_i$, we get $\nabla^2_{ij}f = \nabla_ih\,u_j + h\nabla_iu_j$, and $\nabla^3_{ijk}f = \nabla^2_{ij}h\,u_k + \nabla_jh\nabla_iu_k + \nabla_ih\nabla_ju_k + h\nabla^2_{ij}u_k$. So we get:



df 



llVPVf = hg kl u k 



ViA ij (Vihuj + hViUj) 
Aij f Vijhui + VjhViUi + VihVjUi 

+2hViF i u i + 2F i (V i hui + hVm) 



Differentiating $g^{kl}u_ku_l = 1$ gives $g^{kl}u_k\nabla_ju_l = 0$ and $g^{kl}(\nabla_iu_k\nabla_ju_l + u_k\nabla^2_{ij}u_l) = 0$, so using these relationships, the above expression can be simplified to get the formula given in Lemma [24]. □
Proof of Theorem [7]: We first prove the Theorem in the case $n' = \infty$ and $c = 0$, in which case we get the result of Theorem [5]. Indeed, in this case, the optimal $\rho(x)$ is nothing but $\kappa(x) = \inf_{u \in T_xM}\kappa(x,u)$.

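Let $f$ be an eigenfunction of $L$ for the eigenvalue $-\lambda_1$. With the previous notation for $h$ and $u$, we have:

$$-2\lambda_1\|h\|^2_{L^2(\pi)} = \frac{\mathrm{d}}{\mathrm{d}t}\Big|_{t=0}\|\nabla P^tf\|^2_{L^2(\pi)} = \int 2h(x)L(h)(x) - 2h(x)^2\kappa(x,g^{-1}u(x)) - h(x)^2A^{ij}(x)g^{kl}(x)\nabla_iu_k(x)\nabla_ju_l(x)\,\mathrm{d}\pi(x)$$
$$\leq -2\lambda_1\Big(\int h(x)^2\,\mathrm{d}\pi(x) - \Big(\int h(x)\,\mathrm{d}\pi(x)\Big)^2\Big) - 2\int\kappa(x)h(x)^2\,\mathrm{d}\pi(x) + 0$$

where the inequality $\int hL(h)\,\mathrm{d}\pi \leq -\lambda_1\mathrm{Var}(h)$ is due to the reversibility assumption. By Cauchy-Schwarz, we have

$$\Big(\int h(x)\,\mathrm{d}\pi(x)\Big)^2 \leq \int\frac{\mathrm{d}\pi(x)}{\kappa(x)}\int\kappa(x)h(x)^2\,\mathrm{d}\pi(x).$$

Finally we get:

$$\int\kappa(x)h^2(x)\,\mathrm{d}\pi(x)\Big(\lambda_1\int\frac{\mathrm{d}\pi(x)}{\kappa(x)} - 1\Big) \geq 0,$$

and if $\int\frac{\mathrm{d}\pi}{\kappa} < +\infty$, then $\int\kappa(x)h^2(x)\,\mathrm{d}\pi(x) > 0$, because $f$ is nonconstant, so $h$ can't be 0 almost everywhere. So we have

$$\lambda_1 \geq \frac{1}{\int\frac{\mathrm{d}\pi(x)}{\kappa(x)}}.$$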

In the general case, we have $n' > n$ and the optimal $\rho$ is given by:

f \ 1 • f r> ■ / N , V72 (V^y) 2 

PW = t; Kicfu, uj + V u u w . 

2 uGT^jw.iMNi n' — n 

So we have $\rho \leq \kappa$. Then with the previous notation, we have:

$$\lambda_1\Big(\int_M h(x)\,\mathrm{d}\pi(x)\Big)^2 - \int_M\rho(x)h^2(x)\,\mathrm{d}\pi(x) \geq 0$$

because we have just shown the same with $\kappa$ instead of $\rho$.
We also have 

f M r 2 (f)(x)d*(x) = (V/, V(L/))]d77 = + ^ J h 2 d<K 

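Thus for any $\theta \in [0,1]$ we have:

$$(1-\theta)\lambda_1\Big(\int h\,\mathrm{d}\pi\Big)^2 - \int\Big(\rho - \theta\lambda_1\frac{n'-1}{n'}\Big)h^2\,\mathrm{d}\pi \geq 0.$$

For $\theta = 1$, we have $0 \leq \int_M\big(\lambda_1\frac{n'-1}{n'} - \rho\big)h^2\,\mathrm{d}\pi \leq \big(\lambda_1\frac{n'-1}{n'} - \inf(\rho)\big)\int h^2\,\mathrm{d}\pi$; this proves the Bakry-Emery bound:

$$\lambda_1 \geq \frac{n'}{n'-1}\inf(\rho).$$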
So for any $c \in [0, \inf(\rho)]$, we take $\theta = \frac{n'c}{(n'-1)\lambda_1} \in [0,1]$. By Cauchy-Schwarz, we get

$$(1-\theta)\lambda_1\int(\rho - c)h^2\,\mathrm{d}\pi\int\frac{\mathrm{d}\pi}{\rho - c} - \int(\rho - c)h^2\,\mathrm{d}\pi \geq 0.$$

Thus we get $\big(\lambda_1 - c\frac{n'}{n'-1}\big)\int\frac{\mathrm{d}\pi}{\rho - c} - 1 \geq 0$, which leads to the desired result. □